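The Instruction Set Architecture (ISA) is a foundational abstraction layer and the formal contract between software and hardware. High-level languages such as C hide this complexity, but the ISA exposes the architectural state: the precise configuration of the processor's registers and memory.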
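1. Architectural State
An x86-64 CPU defines its state through several key components:
- Program counter (%rip): holds the address of the next instruction.
- Integer register file: 16 general-purpose registers (for example, %rax, %rbx) that store 64-bit values.
- Condition codes: flag bits used for control flow (ZF, SF, CF, OF).
- Vector registers: for example, the YMM registers (256 bits), used for SIMD operations.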
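To make the condition codes concrete, here is a minimal Python sketch of how ZF, SF, CF, and OF could be derived from a 64-bit addition. The helper `add_flags` and the constants `MASK64` and `SIGN64` are hypothetical names chosen for illustration; the sketch models only the flag definitions, not how any real processor computes them.

```python
MASK64 = (1 << 64) - 1   # all ones in a 64-bit register
SIGN64 = 1 << 63         # the sign bit of a 64-bit value


def add_flags(a: int, b: int) -> dict:
    """Add two 64-bit register values and report the resulting condition codes."""
    a &= MASK64
    b &= MASK64
    full = a + b              # unbounded Python integer sum
    result = full & MASK64    # what a 64-bit register would actually hold

    same_sign_inputs = (a & SIGN64) == (b & SIGN64)
    sign_changed = (result & SIGN64) != (a & SIGN64)

    return {
        "result": result,
        "ZF": result == 0,                        # zero flag
        "SF": bool(result & SIGN64),              # sign flag
        "CF": full > MASK64,                      # carry out of bit 63 (unsigned overflow)
        "OF": same_sign_inputs and sign_changed,  # signed (two's-complement) overflow
    }


if __name__ == "__main__":
    print(add_flags(MASK64, 1))         # unsigned wrap-around to zero
    print(add_flags(2**63 - 1, 1))      # signed overflow into the sign bit
```

Adding 1 to the all-ones pattern sets ZF and CF but not OF, because in signed terms it is just -1 + 1; adding 1 to the largest positive signed value sets SF and OF but not CF.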
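2. Memory Abstraction
Machine code views memory as one huge byte-addressable array. Although x86-64 supports 64-bit virtual addresses, current implementations typically use a 48-bit address space ($2^{48}$ bytes). Data sizes are classified as word (16 bits), double word (32 bits), and quad word (64 bits).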
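The byte-array view can be sketched with a toy memory in Python. The names `memory`, `FORMATS`, `store`, and `load` are hypothetical and exist only for this illustration; the little-endian byte order (`<` in the `struct` format codes) reflects how x86-64 lays out multi-byte values.

```python
import struct

# Toy "memory": a small byte-addressable array.
memory = bytearray(64)

# Data-size names from the text, mapped to little-endian struct format codes.
FORMATS = {"word": "<H", "double word": "<I", "quad word": "<Q"}

# Size of a 48-bit address space in bytes.
ADDRESSABLE_BYTES = 1 << 48


def store(addr: int, kind: str, value: int) -> None:
    """Write an unsigned value of the given size starting at byte address addr."""
    struct.pack_into(FORMATS[kind], memory, addr, value)


def load(addr: int, kind: str) -> int:
    """Read an unsigned value of the given size starting at byte address addr."""
    return struct.unpack_from(FORMATS[kind], memory, addr)[0]


if __name__ == "__main__":
    store(8, "quad word", 0x1122334455667788)
    print(hex(load(8, "word")))          # low 16 bits of the stored quad word
    print(hex(load(8, "double word")))   # low 32 bits
    print(ADDRESSABLE_BYTES // 2**40, "TiB addressable with 48 bits")
```

Reading a word or double word at the same address returns the low-order bytes of the quad word, because the least significant byte is stored at the lowest address.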
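3. Evolution and Compatibility
Driven by Moore's Law, Intel's processors have evolved from the 8086 to the Core i7 Haswell. The ISA guarantees backward compatibility, so legacy machine code still runs on modern multicore, hyperthreaded hardware.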
QUESTION 1
Practice Problem 2.52: Convert the Format A bit pattern 0 011 000 (Sign: 1 bit, Exp: 3 bits, Frac: 3 bits) to the closest value in Format B (Sign: 1 bit, Exp: 2 bits, Frac: 4 bits). Use round-to-even.
0 01 0000
0 10 0000
0 11 1000
0 00 1111
✅ Correct!
In Format A (bias 3), the exponent field 011 encodes an exponent of 0 ($2^0$) and the fraction field 000 gives a significand of 1.0, so the value is 1.0. In Format B (bias 1), an exponent of 0 is represented by 01, so 1.0 is 0 01 0000.
❌ Incorrect
Calculate the actual value first: $(-1)^s \times 1.f \times 2^{E}$, where $E$ is the exponent field minus the bias. The Format A bias is 3 ($2^{k-1}-1$ with $k = 3$).
QUESTION 2
Which of the following assembly lines is valid for a 64-bit system?
movb %ebx, (%rax)
movq %rax, $0x123
movl %eax, (%rsp)
movw %cl, %ax
✅ Correct!
movl %eax, (%rsp) is valid; it moves a 32-bit double word into memory. The others fail: movb %ebx, (%rax) pairs an 8-bit suffix with a 32-bit register, movw %cl, %ax pairs a 16-bit suffix with an 8-bit source register, and movq %rax, $0x123 tries to use an immediate value as the destination.
❌ Incorrect
Review register sizes: %al=8, %ax=16, %eax=32, %rax=64. The suffix must match the register size.
QUESTION 3
If a processor requires 25 cycles for predictable branches and 45 cycles for random branches, what is the Branch Misprediction Penalty ($T_{MP}$)?
20 cycles
40 cycles
10 cycles
65 cycles
✅ Correct!
Using $T_{avg} = T_{OK} + p \times T_{MP}$, for $p=0.5$ (random), $45 = 25 + 0.5 \times T_{MP}$. Thus $20 = 0.5 \times T_{MP}$, so $T_{MP} = 40$.
❌ Incorrect
The average time for random branching includes a 50% chance of misprediction.
QUESTION 4
In the 'Guarded-Do' transformation of a while loop, what happens first?
The loop body executes once unconditionally.
An initial conditional branch checks if the loop should be skipped entirely.
The loop is converted into a switch statement.
The compiler uses a jump-to-middle strategy.
✅ Correct!
Guarded-Do uses an initial 'if' (the guard) to check the condition before entering the 'do-while' style loop body.
❌ Incorrect
Guarded-Do differs from Jump-to-Middle by placing the conditional test at the start of the entire construct.
QUESTION 5
What is the primary function of the leaq instruction when not accessing memory?
To perform fast arithmetic like x + k*y.
To clear the condition code registers.
To move data between XMM registers.
To sign-extend a byte to a quad word.
✅ Correct!
leaq (Load Effective Address) uses address-computation hardware to perform additions and shifts without referencing memory.
❌ Incorrect
While it uses memory operand syntax, it does not dereference the address.
Case Study: Complex Argument Handling
Analysis of Procedure Calls in ISO C99
You are reverse engineering a C program that uses complex numbers (ISO C99). A function complex_add receives two 128-bit structures (representing real and imaginary parts) and returns a result. In x86-64, you notice that %rdi is used for an address, even though the C prototype doesn't show an explicit pointer argument.
Q
How are large complex structures typically passed as arguments to functions?
Solution:
While the first six small arguments (integers/pointers) fit in registers like %rdi and %rsi, structures too large for registers are passed by copying them onto the stack or, in some calling conventions, by the caller passing a pointer to the structure in a register. The exact cutoff depends on the ABI: under the System V AMD64 ABI, a structure of up to 16 bytes (128 bits) built from simple fields can still travel in a pair of registers, and only larger ones are copied onto the stack.
Q
How are complex values or large structs returned from a function in x86-64?
Solution:
For return values too large to come back in registers, the caller allocates space in its own stack frame and passes the address of that space in %rdi as a hidden first argument. The callee then writes the return value directly to that memory location and returns the same address in %rax. A value larger than 64 bits does not fit in %rax alone; under the System V AMD64 ABI, values of up to 16 bytes can still be returned in a register pair (such as %rax and %rdx, or %xmm0 and %xmm1), and the hidden-pointer mechanism applies to anything larger.
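The hidden-pointer return can be mimicked in Python with `ctypes`, purely as a model of the data flow. The sketch assumes a hypothetical 32-byte structure `Vec4` (four doubles), chosen so that it is too large for registers; `vec4_add` and `caller` are likewise illustrative names, and nothing here reflects how a compiler actually emits the code.

```python
import ctypes


class Vec4(ctypes.Structure):
    """Hypothetical 32-byte aggregate: too large to be returned in registers."""
    _fields_ = [
        ("a", ctypes.c_double),
        ("b", ctypes.c_double),
        ("c", ctypes.c_double),
        ("d", ctypes.c_double),
    ]


def vec4_add(out, x, y):
    """Callee: 'out' plays the role of the hidden pointer passed in %rdi."""
    dest = out.contents                      # view of the caller's storage
    for name, _ctype in Vec4._fields_:
        setattr(dest, name, getattr(x, name) + getattr(y, name))
    return out                               # same address goes back, as in %rax


def caller():
    """Caller: reserves the result slot, then passes its address to the callee."""
    slot = Vec4()                            # space in the caller's own frame
    returned = vec4_add(
        ctypes.pointer(slot),
        Vec4(1.0, 2.0, 3.0, 4.0),
        Vec4(10.0, 20.0, 30.0, 40.0),
    )
    same_address = ctypes.addressof(returned.contents) == ctypes.addressof(slot)
    return slot, same_address


if __name__ == "__main__":
    result, same = caller()
    print(ctypes.sizeof(Vec4), "bytes")
    print(result.a, result.b, result.c, result.d, same)
```

The result never travels "by value": the callee fills in storage that the caller already owns, and the returned pointer refers to that same storage.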